I.  INTRODUCTION


The traditional approximations, formulas, and techniques used in item
analysis (Kelley, 1939) were geared to save computational labor at the
expense of accuracy and of the amount of information obtained about
achievement tests, items, and the individuals taking the test. With the
assistance of modern high-speed computers, it became possible to develop a
more sophisticated, accurate, and detailed mathematical approach (Baker,
1964, 1965), which gives test constructors more flexibility, greater
accuracy, and the additional detailed information needed to improve the
evaluation, and hence the quality, of items and tests.

This  paper  describes  the  technical  details  that  are  required  for  the 
use of the IAP program as it is operational on a UNIVAC 1100/81 computer
system  at  the  Manpower  and  Personnel  Division  of  the  Air  Force  Human 
Resources  Laboratory,  Brooks  Air  Force  Base,  Texas.  The  basic  concepts 
and  general  information  are  first  provided.  Detailed  instructions  for 
the  preparation  of  IAP  control  cards  by  the  user  are  provided  in  the  next 
section.  The  appendices  contain  the  computational  formulas  used, 
mathematical  derivations  including  some  proofs,  and  a  sample  run. 

Basic  Concept  of  IAP 

The basic concept in modern item analysis is the item characteristic
curve (Binet and Simon, 1916) and its associated parameters (Tucker,
1946). The curve is essentially a line fitted through the points
obtained by plotting the proportion of respondents to a particular item
or item-alternative against a given criterion score or "ability" score
expressed in standard z-scores. Figure 1 is an example of such an item
characteristic curve. This curve is a cumulative distribution function
of two parameters, X50 and β. When a normal ogive is fitted through the
points, X50 is the point at which 50% of the respondents passed the item.
The corresponding z-score is the "ability" at which the item discriminates
best. Beta (β) is an indicator of the strength of discrimination;
i.e., the larger the β, the sharper the discrimination. Beta is
conceptually the slope of the line drawn to the item characteristic curve
at the point X50. Mathematically it can be shown that β = 1/σ; i.e., the
reciprocal of the standard deviation of the normal ogive. The use of X50
and β provides the scientist with great versatility and flexibility.

They  enable  one  to  draw  specific  inferences  about  a  given  individual  and 
a  given  item,  choose  items  which  have  optimum  discrimination  power  at  a 
certain  ability  level,  screen-off  a  certain  percent  of  a  group  of 
examinees,  estimate  the  "true  score"  of  an  individual,  and  compute  the 
probability  of  the  correct  response.  The  advantages  are  so  numerous  and 
broad  that  only  through  use  can  the  program  be  fully  realized  and 
understood.  Two  of  the  many  applications  are  briefly  discussed  below, 
referencing  Figure  1. 
Figure 1. Example of a typical item characteristic curve
(X50 = 1.0, β = .68, item difficulty = .44).

[Plot: proportion passing the item against the standardized criterion
score. β = .68 is the slope of the line at X50 = 1.0; the point z = 3,
p = .91 is marked on the curve.]

Values tabulated beneath the plot:

Total number of respondents          73            80            74
Respondents who passed this item     19            40            54
Corresponding proportions
passing the item                     19/73 = .26   40/80 = .50   54/74 = .73

141 *

* Item difficulty



1.  Probability  of  a  correct  response 

The  probability  of  correct  response  of  an  individual  for  the  given 
item  can  be  read  directly  from  Figure  1  providing  that  the  individual's 
criterion  score  is  known.  For  example,  the  probability  of  a  correct  response 
for  a  respondent  with  z  =  3  (abscissa  value)  is  .91. 

This probability can also be computed knowing X50, β, and the z-score, as
follows (using X50 and β from Figure 1):

z = β (z-score - X50)

= .68 (3 - 1) = 1.36

From  a  cumulative  normal  distribution  table,  the  area  corresponding 
to  z  =  1.36  equals  approximately  .91. 

Thus,  the  probability  that  the  individual  with  a  criterion  score 
z  =  3  will  pass  this  item  is  P  =  .91. 

Similar  computations  may  be  carried  out  for  respondents  with  any 
given  criterion  score.  Figure  1  shows  these  probabilities  (proportions)  for 
various  standardized  criterion  scores  (z-scores). 
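
The table lookup above can also be done numerically. The following is a minimal Python sketch, not part of the IAP program; the function names `normal_cdf` and `prob_correct` are illustrative assumptions, and the normal ogive is evaluated with the standard error function.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_correct(z_score, x50, beta):
    """Probability of passing an item with parameters X50 and beta
    for a respondent whose standardized criterion score is z_score
    (hypothetical helper, following z = beta * (z_score - X50))."""
    return normal_cdf(beta * (z_score - x50))


# Values taken from Figure 1: X50 = 1.0, beta = .68, criterion score z = 3.
p = prob_correct(3.0, 1.0, 0.68)
print(round(p, 2))
```

At a criterion score equal to X50 the same function returns .50, which agrees with the definition of X50.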
2.  Selecting a certain percent of the examinees


Suppose an achievement test is administered to a group of individuals
and subsequent item analysis provides the standard deviation (in standard
score) of the test and, among other information, the X50 values for each
item. Furthermore, suppose that the upper 16% of the examinees is to be
selected from the rest of the group.

This can be accomplished by choosing items with certain X50 values.
The upper 16% lies above a cumulative area of (100 - 16) = 84 percent. The
corresponding standard score in the cumulative normal frequency distribution
table is z = 1.0. The items to be chosen should be those which have
X50 = (1.0)(σ). If σ = 1.5, the items should have X50 = zσ = (1)(1.5) = 1.5.

The maximum number of discriminations occurs at the point at which 50%
of the examinees pass the item; i.e., between 84% and 100% in the above
example; thus, the upper 16% is selected by maximum discrimination.
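
The selection rule above can be sketched in Python as follows. This is an illustration only, not IAP code; `target_x50` is a hypothetical name, and the inverse normal lookup uses the standard library instead of a printed table, so the cutoff is about 0.99 where the text rounds to z = 1.0.

```python
from statistics import NormalDist


def target_x50(upper_fraction, sd):
    """X50 value that items should have in order to separate the top
    `upper_fraction` of examinees, given the standard deviation `sd`
    reported by the item analysis (hypothetical helper)."""
    z_cut = NormalDist().inv_cdf(1.0 - upper_fraction)
    return z_cut * sd


# Upper 16% of the group, standard deviation 1.5, as in the example above.
print(round(target_x50(0.16, 1.5), 2))
```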

The  traditional  item-analysis  technique  provides  only  the  item 
difficulty  in  terms  of  proportion  of  the  total  respondents  who  choose  a 
particular  response,  and  the  item-criterion  correlation,  and  no  information  is 
available  describing  how  a  particular  item  or  item-alternative  functioned. 
II.  DESCRIPTION  OF  THE  IAP  PROGRAM


General  Information 


The  criterion  upon  which  the  program  bases  all  the  statistical  analyses  is 
specified  by  the  user.  This  criterion  may  be  either  internal  or  external. 

The  internal  criterion  is  the  total  test  score  of  each  individual  taking 
the  test.  These  scores  may  be  used  as  raw  scores  or  may  be  corrected  for 
guessing  according  to  the  user's  specification.  All  item  and  test  statistics 
are  calculated  accordingly.  An  exception  is  made  in  the  case  of  the  frequency 
distribution  of  an  internal  criterion,  where  the  user  may  specify  either  raw 
score-distribution  or  corrected  for  guessing  score-distribution  regardless  of 
the user's specification for the type of scores to be used in the item
analysis.  However,  if  the  user  specifies  the  computation  of  phi-coefficients, 
the  exception  previously  mentioned  is  not  available  and  both  the  phi  and  the 
frequency  distribution  are  based  on  the  same  criterion  used  in  the  computation 
of  item  statistics. 

The  external  criterion  is  furnished  by  the  user.  If  an  external  criterion 
is  specified  on  the  control  card,  the  program  will  use  it  in  the  item 
analysis.  The  validity  coefficient  between  the  external  criterion  and  the 
internal  criterion  is  calculated  using  raw  scores  or  corrected  scores 
depending  upon  the  user's  specification. 

The test may be graded and analyzed as a power test or as a speed test.
In the latter case, only those respondents reaching a particular item will be
considered in the analysis of that item, including the item difficulty and
response distribution. In addition, the mean and standard deviation of the
population reaching the item will be given. Since the X50 represents a
z-score of the population reaching the item, it is not necessarily comparable
to an X50 obtained on another item from a different population. An equivalent
X50 is computed by making a z-score transformation from the "item population"
to the total population:

\[ \text{Equivalent } X_{50} = \frac{\sigma_i}{\sigma_t}\, X_{50} + \frac{\mu_i - \mu_t}{\sigma_t} \]

where $\sigma_t$ = standard deviation of total population

$\sigma_i$ = standard deviation of item population

$\mu_t$ = mean of total population

$\mu_i$ = mean of item population

X50 is based on the item population

This "Equivalent X50" is printed in the item analysis summary table.
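
A short Python sketch of this z-score transformation is given below. It is an illustration under stated assumptions, not the program's own routine: the function name and the numerical values are hypothetical, and the transformation is written as "convert the item-population z-score back to a raw score, then standardize against the total population."

```python
def equivalent_x50(x50_item, mean_item, sd_item, mean_total, sd_total):
    """Re-express an X50 obtained on the population reaching an item
    as a z-score of the total population (hypothetical helper)."""
    raw_score = mean_item + sd_item * x50_item   # undo item-population z-score
    return (raw_score - mean_total) / sd_total   # standardize on total population


# Illustrative numbers only: respondents reaching the item average 32 (SD 6),
# the total group averages 28 (SD 8), and the item has X50 = 0.5.
print(equivalent_x50(0.5, 32.0, 6.0, 28.0, 8.0))
```

When the item population and the total population coincide, the function returns the original X50 unchanged.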

The factor analysis is done on a tetrachoric inter-item correlation
matrix. It is always based on the total number of cases regardless of whether
or not all finished the test. An individual who does not reach an item is
considered to have missed it in this portion of the program. The factor analysis
is a principal component analysis with Varimax rotation and unit diagonal elements.
Computation of the tetrachoric matrix is the slowest part of the program,
particularly if the number of items is large.

The test reliability is influenced by the sum of item variances,
$\sum_{i=1}^{N} p_i q_i$. Here, the proportion answering item i correctly,
$p_i$, is always based on all the cases. This makes interpretation of the
reliability coefficient in the case of a speed test questionable. (The same
can be said about the factor analysis part of the program.)
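
To make the point about speed tests concrete, the sum of item variances can be sketched in Python as below. This is a hypothetical illustration, not the program's code; it assumes a pass/fail matrix in which an item that was not reached is already stored as 0 (missed), so every proportion is based on all cases.

```python
def sum_item_variances(pass_matrix):
    """Sum of p_i * q_i over items, where p_i is the proportion of ALL
    cases passing item i and q_i = 1 - p_i.  Rows are cases, columns
    are items, entries are 1 (pass) or 0 (fail or not reached)."""
    n_cases = len(pass_matrix)
    n_items = len(pass_matrix[0])
    total = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in pass_matrix) / n_cases
        total += p * (1.0 - p)
    return total


# Four cases, two items; the last two cases did not reach item 2.
cases = [[1, 1], [1, 0], [0, 0], [1, 0]]
print(sum_item_variances(cases))
```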

There  is  an  option  designed  to  handle  test  items  for  which  there  is  more 
than  one  correct  answer;  in  such  cases  the  various  responses  receive  different 
score  points  of  credit.  These  items  have  alphanumeric  responses  in  order  to 
be  able  to  handle  a  larger  number  of  possible  responses.  This  option  would 
generally  be  used  with  multiple-response  tests.  The  analysis  of  the  items 
considers  any  credit  achieved  on  an  item  as  passing  the  item,  and  no  credit  at 
all  as  missing  the  item.  This  is  rather  arbitrary,  and  could  be  changed  if 
desired.  There  is  no  correction  for  guessing  on  alphanumeric  response  items. 
The  amount  of  credit  to  be  received  for  a  particular  response  to  a  particular 
item  is  read  in  on  control  cards  (Card  5).  This  option  could  be  used  to 
weight  responses  differentially.  Another  option  provides  an  Item  Alternative 
Information  Roster  containing  information  about  the  validity  and  difficulty  of 
each  item-alternative.  The  validity  is  given  as  the  point-biserial 
correlation  between  the  particular  alternative  and  the  criterion.  The 
difficulty  is  expressed  as  the  proportion  of  the  sample  choosing  a  particular 
alternative.  Additionally,  the  inter-correlation  matrix  of  the  alternatives 
within  each  item  is  provided. 
Preparation of Input Data

The input may be either card or tape, as indicated on Control Card 1,
Col. 44. The FORTRAN logical unit number for the input unit is to be
specified in Col. 6-7 of Card 1.

The format for reading in the data is specified by the user (Card 3). The
restrictions on the input data are as follows:

The ID variable must be either the first or the last word read, as
specified on Card 1, Col. 36.

The external criterion must be the word immediately preceding the
responses, if there is an external criterion.

The ID is read in by "A" format; the numeric responses by "I" format; and
the alphanumeric responses (if any) by "A" format. The external criterion is
read in by "F" format.

If an item is not attempted, it is to be coded "0" for both numeric and
alphanumeric items. This is important for tests that are to be corrected for
guessing.

On a speed test, the first relevant item after the last attempted item is
to be coded a "9" if the item is a numeric response item, and a "W" if the item
is an alphanumeric response item. (Items coded "0" on Card 4, which are not
considered on the test, do not apply here. See write-up for Card 4.) If
desired, all items following the last attempted item can be so coded.
Card or Tape Output

If desired, the following information can be listed on cards or tape:

Test ID, case ID, score, corrected score, external criterion score, and
1/0's for pass/fail for each item. Non-applicable information will be written
as zero.

The format for card output is:

(A6, 2X, A6, 2X, I3, 2X, 2F8.2, 2X, 41I1/80I1/79I1)

The format for tape output is:

(2A6, I3, 2F7.2, 200I1)
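
As a rough illustration of the card layout, the Python sketch below builds the three card images for one case, assuming the card format reads (A6, 2X, A6, 2X, I3, 2X, 2F8.2, 2X, 41I1/80I1/79I1). The function name is hypothetical, IDs are assumed to be left-justified in their six-column fields, and unused item positions are assumed to be padded with zeros up to 200 items; none of these details is confirmed by the program listing.

```python
def card_output_records(test_id, case_id, score, corrected, external, passfail):
    """Lay out one case as three card images: 39 header columns plus
    41 pass/fail digits, then 80 digits, then 79 digits."""
    flags = list(passfail) + [0] * (200 - len(passfail))  # zero padding is an assumption
    digits = "".join(str(f) for f in flags)
    head = (
        f"{test_id:<6.6}  {case_id:<6.6}  "   # A6, 2X, A6, 2X
        f"{score:3d}  "                       # I3, 2X
        f"{corrected:8.2f}{external:8.2f}  "  # 2F8.2, 2X
    )
    return [head + digits[:41], digits[41:121], digits[121:200]]


records = card_output_records("TEST01", "C00001", 17, 14.25, 0.0, [1, 0, 1])
for rec in records:
    print(len(rec))
```

The first card image fills exactly 80 columns, which is consistent with the 41/80/79 split of the item digits.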
III.  CONTROL CARDS


4       I1     IUPLIM   Highest response choice (not including
                        multiple-response answers); may not exceed
                        9. The response choices have to be
                        consecutive positive integers, the largest
                        of which is IUPLIM.

5       I1     NFM      Number of format cards for input data.
                        Default when blank, NFM = 1.

6-7     I2     KI       FORTRAN logical unit number for data input
                        (may be card reader or tape unit). All
                        FORTRAN unit numbers should be greater than
                        9 on the UNIVAC 1100/81.

8-9     I2     KO       FORTRAN logical unit number for tape
                        output, if applicable. May be left blank.

10-11   I2     KS       FORTRAN logical unit number for scratch
                        tape (serves as working storage for
                        program).

12-13   I2     NB       Number of bits in a word on the computer,
                        not counting the sign bit. (NB is used
                        in the word-packing routine.) If blank, NB
                        will be set to 35.

14-15   F2.2   EASY     Specified difficulty level for identifying
                        too-easy items, in percent (two digits with
                        no decimal point).*

16-17   F2.2   DIFFLT   Specified difficulty level for identifying
                        too-difficult items, in percent (two digits
                        with no decimal point).

                        NOTE: Default when both EASY and DIFFLT
                        are blank: EASY = .8 and DIFFLT = .2.

18      I1     IALPHA   0 if all items are single-response
                        (numeric).
                        1 if one or more items are
                        multiple-response (alphanumeric).

19      I1     NSPEED   0 if a power test.
                        1 if a speed test.


* This option and the next one (DIFFLT) print out items whose difficulty
is outside of the specified difficulty range.
               ICORGS   0 if scores are not to be corrected for
                        guessing.
                        1 if scores are to be corrected for
                        guessing by either standard formula or
                        Hamilton's (1950) formula (see column 45).
                        If corrected scores are called for, they
                        will be used in all item analysis, unless
                        an external criterion is used.

               KIV      0 if negative corrected scores are not to
                        be set to zero.
                        1 if negative corrected scores are to be
                        set to zero.

               ICRIT    0 if test score is to be used as the
                        criterion.
                        1 if an external criterion is to be used
                        (in this case, all item analyses will be
                        based on this criterion).

               OUT      Code for missing criterion score, in the
                        case of an external criterion. Must be an
                        integer value (no decimal point). Cases
                        with missing criterion score will be
                        excluded from the analysis and printed out.

               NOCASE   0 if case scores are to be printed.
                        1 if printing of case scores is to be
                        suppressed.
27      I1     IPHI     0 if no phi correlation coefficient is
                        desired.
                        1 if phi based on median score for the
                        sample is desired.
                        2 if phi based on mean is desired. (In
                        case of a speed test, the mean will be
                        computed based upon only those individuals
                        who reached the particular item.)

28      I1     IFREQ    0 if no frequency distribution of scores is
                        desired.
                        1 if frequency distribution is to be done
                        with external criterion score (if
                        applicable).
                        2 if frequency distribution is to be done
                        with corrected test scores (if applicable).
                        3 if frequency distribution is to be done
                        with raw test scores.

                        NOTE: If phi with median is called for,
                        the frequency distribution will be done
                        accordingly, regardless of user's
                        specification, since the median is
                        calculated from the frequency
                        distribution. For example, if phi with
                        median is desired, and an external
                        criterion is being used, IFREQ will be set
                        to 1 automatically.
29      I1     LOVER    0 if Henrysson's (1963) method for overlap
                        correction of item analysis with internal
                        criterion is to be used.
                        1 if Guilford's (1933, 1965) method is to
                        be used.
                        2 if no overlap correction is desired.

                        NOTE: If overlap correction is called for,
                        both the uncorrected and the corrected
                        values will be printed.

30      I1     JPLOT    0 if only the proportion of individuals
                        passing each item at various z-score
                        (standard score) levels is to be printed.
                        (No plot.)
                        1 if the (fitted) item characteristic
                        curves are to be plotted.

                        NOTE: On a speed test, proportions will be
                        based only on those individuals who
                        reached the particular item.

31-32   I2     NF       Number of factors to be extracted from
                        tetrachoric inter-item correlation matrix.
                        If NF is specified as zero, the inter-item
                        correlation matrix will not be computed.
                        Otherwise, NF must lie in the range:
                        2 ≤ NF ≤ 10.
33-35   F3.2   EIGN     Eigenvalue to serve as stop criterion for
                        factor analysis of tetrachoric correlation
                        matrix, if applicable. If an eigenvalue
                        falls below this value, no further factors
                        are extracted. This value is put on the
                        card as three digits, the last two of which
                        are considered to be after the decimal
                        point. For example, if an eigenvalue
                        cutoff of 1.00 is desired, it should be
                        specified on the card as 100. An
                        eigenvalue of 1.00 is commonly used.

36      I1     IDEND    0 if identification variable (ID) precedes
                        responses in input data.
                        1 if ID follows responses.

37      I1     IRWIND   0 if input data tape is to be rewound
                        before processing test; 1 otherwise.
                        If only one test is being processed, this
                        option is irrelevant. If the same cases
                        are to be used as were used in the previous
                        test, IRWIND should be specified as 0, and
                        the input format should pick up the fields
                        on the tape that correspond to the present
                        test.
38      I1     IOUT     0 if no tape or card output is requested.
                        1 if tape output is requested.
                        2 if card output is requested.
                        The output will consist of a test ID, the
                        case ID, score, corrected score, external
                        criterion score, and 1/0's for pass/fail
                        for each item. Any non-applicable
                        information (such as the external criterion
                        score for an internal criterion test run)
                        will be written as zero.

39-42   A4     ATEST    Test ID for tape or card output, if
                        applicable.

43      I1     IALT     0 if scores are not to be corrected for
                        guessing; or, if each item has the same
                        number of response choices, and this number
                        is equal to IUPLIM as specified in Col. 4.
                        1 only if correction for guessing is called
                        for; and, in addition, at least one item
                        does not have the number of response
                        choices equal to IUPLIM; if this is the
                        case, additional card(s) will be necessary
                        (Card 6).

44      I1     ICARD    0 if input data are on tape.
                        1 if input data are on cards.
45      I1     IHAM     1 if scores are to be corrected with
                        Hamilton's (1950) formula; 0 otherwise.
                        IALT must be 0; ICORGS must be 1.

46-48   I3     NERR     Maximum number of permissible input data
                        errors specified by user (i.e., data do not
                        match format editing code type, like
                        reading alphanumeric with an I format). If
                        the number of errors equals or exceeds this
                        number, the program will terminate. The
                        case number and ID of each case with this
                        type of error will be printed. (See KERR,
                        Col. 58.)

49-52   I4     NBLK     Blocksize if the input data file on unit KI
                        is COBOL (max blocksize = 1203).

53-55   I3     LRL      Logical record length if data file on unit
                        KI is COBOL. Both this field and the
                        preceding one must be non-zero if the file
                        is in COBOL (max LRL = 250).

56-57   I2     KS2      Unit ID for temporary file needed when
                        'Item Alternative Information Roster' is
                        requested (0 or blank indicates the above
                        roster is not requested).
                        This file need not be assigned because the
                        system will assign a temporary file of
                        sufficient size.
58      I1     KERR     Data read error switch:
                        0 = system (FORTRAN) error exit routine,
                        1 = program error exit routine (see also
                        NERR, Col. 46-48).

                        NOTE: The system error exit will translate
                        the error input character(s) into zeroes
                        and print the system error message. The
                        case is retained. The IAP program error
                        exit will print the error case (see NERR)
                        and reject the case.
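
The implied-decimal fields on this card (EASY, DIFFLT, and EIGN) are punched without a decimal point. The Python sketch below shows one way such fields could be decoded; it is an illustration only, with hypothetical function names, and it assumes the conventional FORTRAN F-format behavior in which the last two digits fall after the decimal point when no point is punched.

```python
def implied_decimal(field, decimals):
    """Decode a punched numeric field with an implied decimal point.
    Returns None for a blank field so the caller can apply a default."""
    text = field.strip()
    if not text:
        return None
    if "." in text:                      # an explicit point overrides the implied one
        return float(text)
    return int(text) / 10 ** decimals


def read_thresholds(card):
    """Pick EASY (cols 14-15), DIFFLT (cols 16-17) and EIGN (cols 33-35)
    off a Card 1 image, applying the documented defaults."""
    card = card.ljust(80)
    easy = implied_decimal(card[13:15], 2)
    difflt = implied_decimal(card[15:17], 2)
    eign = implied_decimal(card[32:35], 2)
    if easy is None and difflt is None:  # default when both are blank
        easy, difflt = 0.8, 0.2
    return easy, difflt, eign


def make_card(fields):
    """Build an 80-column card image from {starting column: text}."""
    cols = [" "] * 80
    for start, text in fields.items():
        cols[start - 1:start - 1 + len(text)] = text
    return "".join(cols)


print(read_thresholds(make_card({14: "80", 16: "20", 33: "100"})))
```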


Card  2  -  Title  Card 


Any  title  less  than  or  equal  to  72  characters,  starting  in  Col  1. 

Card  3  -  Input  Data  Format  Card(s) 

The ID will be read in "A" format with a field width of not more than six
characters.

Responses to items with numeric answers will be read in "I" format.

Responses to items with alphanumeric answers (multiple-response items), if
any, will be read in "A" format.

Skipped fields are indicated by "X". The format should begin with a left
parenthesis and end with a right parenthesis. If more than one card is
necessary, simply continue the format on additional cards. The number of
format cards is specified on Card 1, Col 5. Each card of the format is read
through Col 72 only.
As mentioned above under "Data Specification," the ID must be either the
first or the last word read, and the external criterion (if any) must always
precede the responses. The external criterion will be read in "F" format.

If a case requires more than one record (e.g., more than one card), a slash
(/) in the format will cause the next record to be read.

Examples:


1.  (5X, A6, 30X, 20I1, X, A1)

        5X      Skipped (5 columns)
        A6      ID
        30X     Skipped (30 columns)
        20I1    Numeric (20 responses)
        X       Skipped (1 column)
        A1      Alpha (1 response)


2.  (A6, I1, X, 3I1, 4A1, 50X, 3I1 / 20I1)

        A6      ID
        I1      Numeric (1 response)
        X       Skip (1 column)
        3I1     Numeric (3 responses)
        4A1     Alpha (4 responses)
        50X     Skip (50 columns)
        3I1     Numeric (3 responses)
        /       Skip to next record
        20I1    Numeric (20 responses)


3.  (X, A6, 3X, F6.2, 5X, I1, X, 20I1)

        X       Skip (1 column)
        A6      ID
        3X      Skip (3 columns)
        F6.2    External criterion
        5X      Skip (5 columns)
        I1      Numeric (1 response)
        X       Skip (1 column)
        20I1    Numeric (20 responses)


Here, two format cards are required to read one record since the format
required more than 72 columns. In Example 2, one format card was needed to
read two records.


Card(s) 4 - Answer Key for Numeric Items

The first three columns of the card(s) are not read, so anything may be
written there (such as "KEY").

Starting in Col. 4, each column corresponds to an item specified in the
"format" statement, excluding the ID and the external criterion. For numeric
items, the correct answer should be specified in the corresponding column.
For alphanumeric (multiple-response) items, a "9" should be specified in the
corresponding column. (A special key will be read in for alphanumeric
items.) Each "answer key" card contains keys for up to 77 items; 200 items
will require three cards.

If desired, items can be omitted from analysis without changing the
format. This is done by specifying a "0" in each of the columns corresponding
to those items. The remaining items will be referred to in the output by the
same numbers that they would have without any 0's in the key.
Examples:

KEY2341544357

This is a 10-item test in which all responses are numeric. The correct
answer to item 1 is 2, and the correct answer to item 10 is 7.

KEY2341000057

This is a key for the same test as before, but items 5 through 8 are
removed from the test by replacing the correct alternative with zero. The
remaining items will be referred to as before, so items 1, 2, 3, 4, 9, and 10
are listed in the output.

Note that the same thing could have been accomplished by changing the
format to "X" out items 5 through 8, and having the key changed to KEY234157.
However, the items would now be referred to as items 1, 2, 3, 4, 5, and 6.

KEY1235499221

Here, items 6 and 7 are alphanumeric (multiple-response), and the correct
answers to these items will be read in on the next card.
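
The key-card conventions described above (three unread columns, "0" for an omitted item, "9" for an alphanumeric item) can be sketched in Python as follows. The function name is hypothetical, and the key strings are written with plain digits on the assumption that the second example reads 2, 3, 4, 1, four zeros, 5, 7 and the last example reads 1, 2, 3, 5, 4, 9, 9, 2, 2, 1.

```python
def parse_numeric_key(cards):
    """Read Card(s) 4.  Columns 1-3 are ignored; each following column
    (up to column 80) is one item.  Returns {item number: entry}, where
    the entry is the keyed answer, "omitted" for 0, or "alphanumeric"
    for 9.  Item numbers are positions in the key, so omitted items
    still consume a number."""
    status = {}
    item = 0
    for card in cards:
        for ch in card[3:80].rstrip():
            item += 1
            if ch == "0":
                status[item] = "omitted"
            elif ch == "9":
                status[item] = "alphanumeric"
            else:
                status[item] = int(ch)
    return status


key = parse_numeric_key(["KEY2341000057"])
print([i for i, entry in key.items() if entry != "omitted"])
```

Because numbering follows the position in the key, the items that remain keep their original numbers, which matches the behavior described for keys containing zeros.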

Card(s)  5  -  Answer  Key  for  Alphanumeric  Items 

This card is optional and is included only if "IALPHA" was specified as 1
(Card 1, Col 18).

There are 33 possible multi-response codes. In order that each response
occupy only one character in the data file, these responses are coded
alphanumerically, using the numbers 1 through 9 and all letters of the alphabet
except W and Y. These characters are converted to integers by a method that
is machine-dependent; on machines other than the UNIVAC 1100/81 a few changes
will probably be necessary.
The codes for each alphanumeric character are as follows:


Alphanumeric Character (Response) and Code

Character  Code      Character  Code      Character  Code

    1        1           A       10           N       23
    2        2           B       11           O       24
    3        3           C       12           P       25
    4        4           D       13           Q       26
    5        5           E       14           R       27
    6        6           F       15           S       28
    7        7           G       16           T       29
    8        8           H       17           U       30
    9        9           I       18           V       31
                         J       19           X       32
                         K       20           Z       33
                         L       21
                         M       22

Any other response will be considered as an omit.

Correction for guessing is not made on alphanumeric items.

The character "W" is reserved for indicating the item after the last item
attempted by a subject on a speed test with alphanumeric responses.

The card(s) are prepared as follows: Each response-item combination will
occupy six card columns. The first three columns will contain the item
number; the next two columns will contain the code for the response as given
above; and the last column will contain the number of points credit to be
given that response. Any response not listed will receive no credit. Up to
13 responses may be listed per card (Cols 1-78). If more than 13
item-response combinations are necessary, continue the same procedure on
subsequent cards. Each item-response combination listed must immediately
follow the preceding. The six columns following the last combination must
contain 999999. If a frequency distribution of scores is to be made, the
maximum total score must not exceed 1000.
Example:

Suppose that items 3 and 5 have alphanumeric responses, and that each one
has two possible responses that are to receive credit: a "2" to receive 1
point, and a "B" (code "11") to receive 2 points. The card would appear:


003021003112005021005112999999

Each six-column group gives the item number (three columns), the response
code (two columns), and the points credit (one column).

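
The six-column layout of Card 5 can be read with a few lines of Python. The sketch below is illustrative only; the function name is hypothetical, and the sample card assumes the groups for item 5 begin with 005, as the wording of the example implies.

```python
def parse_alpha_key(cards):
    """Read Card(s) 5.  Each six-column group holds the item number
    (3 columns), the response code (2 columns) and the points credit
    (1 column); up to 13 groups per card (cols 1-78).  Reading stops at
    the 999999 group.  Returns {(item, code): points}."""
    stream = "".join(card[:78].ljust(78) for card in cards)
    credits = {}
    for start in range(0, len(stream), 6):
        group = stream[start:start + 6]
        if group == "999999":            # end-of-key marker
            break
        item = int(group[0:3])
        code = int(group[3:5])
        points = int(group[5])
        credits[(item, code)] = points
    return credits


credits = parse_alpha_key(["003021003112005021005112999999"])
print(credits[(3, 11)], credits[(5, 2)])
```

Any (item, code) pair that is absent from the returned dictionary corresponds to a response that receives no credit.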

Card(s) 6 - Alternate Response Cards

This card is optional and is included only if "IALT" (Col 43 of Card 1)
was "1".

The purpose of the card is to indicate the number of choices for each item
(if any different from "IUPLIM" in Col 4 of Card 1) so that the proper
correction for guessing can be made.

The first three columns are not read and may contain anything (such as
"ALT"). Starting in Col 4, each column contains the number of alternate
response choices for the corresponding item. If there are more than 77 items,
continue on a second card (skipping the first three columns again). If an
item is alphanumeric, its corresponding column may be left blank, or may
contain any integer, since it is not used.

Example:


ALT444555333

Here, there were nine items. The first three items had four alternatives;
items 4 through 6 had five alternatives; and items 7 through 9 had three
alternatives.
Note:


Following the last control card for the last job there must be a card
containing "999" in Cols 1-3.

Card Input Sequence (in reverse order).

Stop (System Cards)


999 (End of Job Card)

The cards in brace "A" below may be repeated
for any successive jobs.


Card (or tape record) containing 999999 in ID field.
(End of Run Card)


Data cards or tape input


Optional (Alternate Response Card) (Card 6)


Optional (Answer Key for Alphanumeric Items) (Card 5)


Answer Key for Numeric Items (Card 4)
IV.  SUMMARY


This  report  describes  the  development  and  implementation  of  a 
state-of-the-art  computer-based  item  analysis  technique.  It  deviates  from 
traditional  techniques  by  providing  detailed  information  about  the 
characteristics  of  achievement  test  items,  particularly  the  ability  level  at 
which  a  given  item  discriminates  most  and  the  degree  of  discrimination.  Here 
discrimination  is  independent  of  item  difficulty,  unlike  traditional  methods 
where  the  discrimination  index  is  a  function  of  the  difficulty.  This  paper 
includes  all  information  necessary  for  potential  users  and  provides  all 
formulas  and  mathematical  derivations  upon  which  the  algorithm  is  based.  The 
computer program has been written in FORTRAN V on the UNIVAC 1100/81 computer
system  and  is  easily  convertible  to  other  systems.  An  exception  is  the 
plotting  subroutine  for  the  item  characteristic  curves.  This  subroutine  is 
written  in  COBOL  and  is  machine-specific.  However,  the  program  can  be  used 
without  the  plot-routine  since  one  of  the  options  provides  all  numerical 
information  about  the  item  characteristic  curves  and  permits  manual  graphing 
with  ease. 
APPENDIX  A


COMPUTATIONAL  FORMULAS  AND  MATHEMATICAL  DERIVATIONS 


1. Raw Score = number of correctly answered items.

2. Correction for Guessing Formulas.

a. Standard correction (Guilford, 1965, p. 489.)

$$\text{Corrected Score} = (\text{number correct}) - \sum_{i=1}^{n} \frac{1}{K_i - 1}$$

where $K_i$ = the number of choices for item i,
n = number of items included in the item analysis,
and the sum is taken over those items to which a wrong response was given.

If $K_i = K$ for all i, then this reduces to

$$(\text{number correct}) - \frac{\text{number wrong}}{K - 1}$$


For  items  with  more  than  a  single  correct  answer  (multiple-response),  no 
correction  for  guessing  is  made. 
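The standard correction can be sketched in a few lines of Python. This is an illustration rather than the IAP code: the function name `corrected_score` is hypothetical, and treating an omitted response (`None`) as neither right nor wrong is an assumption of this sketch.

```python
from typing import Optional, Sequence


def corrected_score(responses: Sequence[Optional[int]],
                    key: Sequence[int],
                    n_choices: Sequence[int]) -> float:
    """Number correct minus the sum of 1/(K_i - 1) over wrongly answered items."""
    n_correct = 0
    penalty = 0.0
    for resp, right, k in zip(responses, key, n_choices):
        if resp is None:          # omitted item: assumed to carry no penalty
            continue
        if resp == right:
            n_correct += 1
        else:
            penalty += 1.0 / (k - 1)
    return n_correct - penalty


# Nine items with 4, 4, 4, 5, 5, 5, 3, 3, 3 choices; items 1, 5 and 9 are wrong.
key = [1] * 9
answers = [2, 1, 1, 1, 2, 1, 1, 1, 2]
print(corrected_score(answers, key, [4, 4, 4, 5, 5, 5, 3, 3, 3]))
```

Here the penalty is 1/3 + 1/4 + 1/2, so six correct answers give a corrected score of 59/12.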


b.  Hamilton  Correction  (Hamilton,  1950) 


Corrected  Score  =  a+(b)*(raw  score) 

where  a  and  b  are  the  coefficients  of  the  linear  regression  of  the  corrected 
score  on  the  raw  score. 


R  W  -  n  <Qr* 
(k-t  )  6T* 


Sn<i 


where: R = mean number of questions answered correctly

W = mean number of incorrect answers

n = number of items in the test

k = number of alternatives per item

$\sigma_r^2$ = variance of the raw scores

The  squared  correlation  coefficient  between  the  corrected  scores  and  raw 
scores  is: 

*  kff.-n  +  l 

T  =  * - 1 - r - 

NOTE: For multiple-response items (more than one correct answer), no
correction for guessing is made.
3. Mean

$$\bar{X} = \frac{\sum X_i}{N}$$

where N = the number of scores being considered.

4. Variance

$$\sigma^2 = \frac{\sum X^2 - (\sum X)^2 / N}{N - 1}$$


5. Standard Deviation

$$\sigma = \sqrt{\sigma^2}$$

6. Standard Error of the Mean

$$\frac{\sigma}{\sqrt{N}}$$


7. Skewness

$$\text{Skewness} = \frac{N \cdot S_3}{(N - 2)(N - 1)\,\sigma^3}$$

where $S_3 = \sum X^3 - 3(\sum X / N)\sum X^2 + 2(\sum X / N)^2 \sum X$


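8. Kurtosis

$$\text{Kurtosis} = Q_4 / \sigma^4$$

where

$$Q_4 = \frac{N}{(N-1)(N-2)(N-3)} \left[ (N+1)\left( \sum X^4 - 4(\textstyle\sum X/N)\sum X^3 + 6(\textstyle\sum X/N)^2 \sum X^2 - 3(\textstyle\sum X/N)^3 \sum X \right) - 3\,\frac{N-1}{N}\left( \sum X^2 - \frac{(\sum X)^2}{N} \right)^2 \right]$$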


9. Standard Error of the Standard Deviation

$$\sigma \sqrt{\frac{.5 + .25\, Q_4/\sigma^4}{N}}$$


10. Standard Error of Skewness

$$\sqrt{\frac{6N(N-1)}{(N-2)(N+1)(N+3)}}$$

11. Standard Error of Kurtosis

$$\sqrt{\frac{24N(N-1)^2}{(N-3)(N-2)(N+3)(N+5)}}$$


12. z-score

$$z = \frac{X - \bar{X}}{\sigma}$$

13. T score = (z score) (10) + 50
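The descriptive statistics in items 3 through 13 can be computed from running sums of powers of the scores, which appears to be how the formulas above are organized. The following Python sketch uses the usual small-sample expressions for skewness and kurtosis and their standard errors; the function name `describe` and the dictionary keys are assumptions made for illustration, not IAP names.

```python
import math
from typing import Dict, Sequence


def describe(scores: Sequence[float]) -> Dict[str, float]:
    """Mean, variance (N-1), skewness, kurtosis and standard errors from power sums."""
    n = len(scores)
    if n < 4:
        raise ValueError("at least four scores are needed for kurtosis")
    s1 = sum(scores)
    s2 = sum(x ** 2 for x in scores)
    s3 = sum(x ** 3 for x in scores)
    s4 = sum(x ** 4 for x in scores)
    m = s1 / n
    ss = s2 - s1 * s1 / n                      # sum of squared deviations
    var = ss / (n - 1)
    sd = math.sqrt(var)
    dev3 = s3 - 3 * m * s2 + 2 * m * m * s1    # sum of cubed deviations (S3)
    dev4 = s4 - 4 * m * s3 + 6 * m * m * s2 - 3 * m ** 3 * s1
    q4 = n / ((n - 1) * (n - 2) * (n - 3)) * ((n + 1) * dev4 - 3 * (n - 1) / n * ss ** 2)
    return {
        "mean": m,
        "variance": var,
        "sd": sd,
        "se_mean": sd / math.sqrt(n),
        "skewness": n * dev3 / ((n - 1) * (n - 2) * sd ** 3),
        "kurtosis": q4 / var ** 2,
        "se_skewness": math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3))),
        "se_kurtosis": math.sqrt(24 * n * (n - 1) ** 2 / ((n - 3) * (n - 2) * (n + 3) * (n + 5))),
    }


def t_score(x: float, mean: float, sd: float) -> float:
    """T score = 10 z + 50, with z = (x - mean) / sd."""
    return 10.0 * (x - mean) / sd + 50.0


stats = describe([1, 2, 3, 4, 5])
print(stats["skewness"], stats["kurtosis"])
```

For the symmetric scores 1 through 5 the skewness is zero and the kurtosis is -1.2, which makes a convenient hand check.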


14. Item Standard Deviation

$$\sqrt{pq}$$

where p = the proportion of examinees answering the item correctly,

q = the proportion of examinees answering the item incorrectly,

where, necessarily, q = 1 - p.


15. Point Biserial Correlation (between item and total test score).

$$r_{pbis} = \frac{M_p - M_q}{\sigma} \sqrt{pq}$$

where $M_p$ is the mean test score for persons answering the item
correctly, $M_q$ is the mean test score for persons answering the item incorrectly.


16. Biserial correlation.

$$r_{bis} = \frac{M_p - M_q}{\sigma} \cdot \frac{pq}{y}$$

where y is the ordinate at the point of dichotomy in a standard normal
distribution (see 26, below).


17. t-test to test the significance of the correlation coefficient.

$$t = r \sqrt{\frac{N - 2}{1 - r^2}}$$

where r is the correlation coefficient and the resulting t has (N - 2)
degrees of freedom.
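A short Python sketch of the point biserial, the biserial, and the associated t value may help. It is a sketch under stated assumptions: the function name `item_correlations` is hypothetical, the test standard deviation is computed with the N - 1 denominator used for the variance in this appendix, and the ordinate y is taken from the standard library's `NormalDist` rather than from the inverse normal routine described later.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def item_correlations(item: Sequence[int], total: Sequence[float]) -> Tuple[float, float, float]:
    """Return (point biserial, biserial, t) for a 0/1 item against total scores."""
    n = len(item)
    passed = [t for i, t in zip(item, total) if i == 1]
    failed = [t for i, t in zip(item, total) if i == 0]
    if not passed or not failed:
        raise ValueError("item must have both passing and failing examinees")
    p = len(passed) / n
    q = 1.0 - p
    m_p = sum(passed) / len(passed)
    m_q = sum(failed) / len(failed)
    mean = sum(total) / n
    sd = math.sqrt(sum((t - mean) ** 2 for t in total) / (n - 1))  # N-1 assumed
    x = NormalDist().inv_cdf(q)        # abscissa with proportion p above it
    y = NormalDist().pdf(x)            # ordinate at the point of dichotomy
    r_pbis = (m_p - m_q) / sd * math.sqrt(p * q)
    r_bis = (m_p - m_q) / sd * (p * q / y)
    t = r_pbis * math.sqrt((n - 2) / (1.0 - r_pbis ** 2))
    return r_pbis, r_bis, t


print(item_correlations([1, 1, 1, 0, 0], [5, 4, 3, 2, 1]))
```

The biserial is always larger in magnitude than the point biserial for the same item, since the factor pq/y exceeds the square root of pq.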


18. X50

$$X_{50} = \frac{X}{r_{bis}}$$

where X = the abscissa value at the point of dichotomy in a standard normal
distribution (see 26 below).

X50 specifies the z-score on the (fitted) item characteristic curve at
which 50% of the persons having that z-score chose the correct response. The
item characteristic curve is a cumulative normal ogive fitted to the
distribution of z-scores versus the proportion passing the item at each
z-score level.

19. $\beta$

$$\beta = \frac{r_{bis}}{\sqrt{1 - r_{bis}^2}}$$

= measure of the discrimination power of the item.

In non-technical terms $\beta$ may be thought of as the slope of the item
characteristic curve at X50. Mathematically it is the inverse of the standard
deviation of the normal (fitted) ogive.


20. Reliability index of the item.

$RI = (r_{pbis}) \sqrt{pq}$ = the contribution of the item variance to the total
test variance (Gulliksen, 1950, pp. 375-378).


21. Kuder Richardson Formula 20 (test reliability).

$$\text{KR-20} = \frac{n}{n - 1} \left( 1 - \frac{\sum_{i=1}^{n} p_i q_i}{\sigma^2} \right)$$

where n = number of test items,

$\sigma^2$ = variance of scores on test,

$p_i$ = proportion of examinees passing item i (difficulty of item),

$q_i$ = proportion of examinees failing item i, where $q_i = 1 - p_i$.
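As an illustration of Kuder Richardson Formula 20, the following Python sketch computes the coefficient from a persons-by-items matrix of 0/1 scores. The function name `kr20` and the `ddof` argument are assumptions of this sketch; `ddof=1` follows the N - 1 variance used in this appendix, while `ddof=0` gives the population-variance form often quoted in textbooks.

```python
from typing import Sequence


def kr20(responses: Sequence[Sequence[int]], ddof: int = 1) -> float:
    """KR-20 from a matrix of 0/1 item scores (rows = examinees, columns = items)."""
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    mean = sum(totals) / n_people
    var = sum((t - mean) ** 2 for t in totals) / (n_people - ddof)
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in responses) / n_people   # item difficulty
        sum_pq += p * (1.0 - p)
    return n_items / (n_items - 1) * (1.0 - sum_pq / var)


data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(data), kr20(data, ddof=0))
```

The two conventions differ noticeably on very small samples such as this one and converge as the number of examinees grows.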


22. Phi Coefficient

$$\phi = \frac{BC - AD}{\sqrt{(A + B)(C + D)(A + C)(B + D)}}$$
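The phi coefficient is a direct computation on the four cell counts. In the Python sketch below, the function name `phi_coefficient` is hypothetical, and the cell layout (A and B in the first row, C and D in the second, so that B and C are the concordant cells under the sign convention BC - AD) is an assumption, since the four-fold table itself is not reproduced here.

```python
import math


def phi_coefficient(a: float, b: float, c: float, d: float) -> float:
    """Phi for a four-fold table with rows (A, B) and (C, D)."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0.0:
        raise ValueError("a marginal total is zero; phi is undefined")
    return (b * c - a * d) / denom


print(phi_coefficient(10, 20, 30, 10))
```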
To be more specific, the procedure is:


(1)  Determine,  for  each  variable,  what  the  point  of  dichotomy  should  be 
in  a  standard  normal  distribution  to  produce  the  observed  proportions 
above  and  below  the  dichotomy  for  that  variable.  This  is  simply  the 
inverse  function  for  the  normal  distribution,  whose  computation  is 
described  above. 

(2)  Determine  what  correlation  in  a  bivariate  normal  distribution  will 
give  the  observed  proportions  in  the  four  regions  described  by  the 
four-fold  table.  This  involves  two  problems: 

a.  The  bivariate  normal  distribution  must  be  represented  as  a 
function  of  r,  the  correlation,  and  equated  to  the  observed 
proportion  in  a  given  region. 

b.  This  equation  must  be  solved  for  r.  An  iteration  scheme  must  be 
used,  since  the  bivariate  normal  distribution  is  an  integral  that 
must  be  computed  numerically  or  written  in  a  series  expansion;  if 
written  as  a  series  expansion,  a  polynomial  equation  of  high  degree 
must  be  solved,  which  requires  iteration. 

The usual approach is to use the series expansion and solve by iteration.
However, for even a moderately large r, the series converges very slowly; and
for each iteration, it must be recomputed.
What is needed then is:

(1) A better method for computing the bivariate normal distribution.

(2) A scheme requiring the fewest possible iterations, since the slowest
part of the computation is the bivariate normal.

The method used for computing the bivariate was based on the T-function of
Owen (1956, p. 1075). The error in the program is less than $5 \times 10^{-8}$ for all
correlations and upper limits.

The equation was solved in a manner similar to Newton's method (Acton, 1970),
but with higher order terms included. This was done because the
higher order derivatives can be obtained very simply, and are much cheaper
than further iterations. (It was found that, using the Cosine-Pi formula of
Pearson as a starting approximation, usually only one iteration was necessary
to produce the maximum available accuracy, and at most two.)
The series was developed as follows:

Let

$$B(h, k, r) = \int_{-\infty}^{h} \int_{-\infty}^{k} \frac{1}{2\pi\sqrt{1 - r^2}} \exp\left( -\frac{x^2 + y^2 - 2xyr}{2(1 - r^2)} \right) dx\, dy$$

If B is differentiated with respect to r, then the double integration can be
performed, leaving

$$\frac{dB}{dr} = \frac{1}{2\pi\sqrt{1 - r^2}} \exp\left( -\frac{h^2 + k^2 - 2hkr}{2(1 - r^2)} \right)$$

There  fore : 


Let 


*!  /ViVk-l WKr>  \ 
i  1  *  V  i-r*  ) 

ab  V 

*U£±!^) 

z(0-- e. 


Then $\frac{dr}{dB}$ can be rewritten as follows:


ar 

aB 


=  ^(r)  (i-rx)x 


Furthermore


=  Z^[(L\W'-lV>lcr)(»-X'-r")  -Lk-rj 


a_/ciV 

cl  s>  [dr 


<\  r 

ft 


dr 

dft 


2  z 


riz 

c\  r 


(V+k2-Zkh,)(t-)(|-r!)-U  -  r 


2 

+  z 


( 


h  + 


-I 

2hkr)(l-r'i) 


—  2.  hkr(  I  -  r 2)  +•  2r1(l-rx)  (  h\ kl-  2  k  k r  )  -  1 


=  Z  (r) 


( <HX k1-  2hkr)(»')(l -r1)  | 


j(hW-  2hkr)(r)(  I-  rx) 2hk  j 
^  ^  h*  v  k  *  -  2hkr  )  *-  h 1  k  -r  ^  I  -  r1^ 


By assigning constants as follows, these three terms can be further simplified.


4  1 


Let 


v-  hk 


1  -  r 


i- .  v^T 


c. 


2C 


0 


2  2 

Cti  h  ♦  k  -  C3r 

S  :  VC1 

C6  ®  rC5 


(2ujexp  (i  C  ) 


C6  '  C0 


D3  :  2  [c6  (C6  -  C3)  +  |  C4  +  Co]/C?  '  C2 


Then: 


dr 
dB  " 


C  Z 

a 


D  Z  ‘ 

2 


3 

D  Z 

3 


No  attempt  was  made  to  develop  higher  order  terms,  because  three  seemed  to 
give  maximum  accuracy  with  only  one  iteration  in  most  cases;  higher  terms 
could  be  generated  in  a  straightforward  manner,  with  considerable  labor. 




A first approximation is made with Pearson's Cosine-Pi formula. With this
r, the proportion in any one of the four regions is computed using the scheme
mentioned above. The difference between this proportion and the desired one
in the region is computed, called, say, $\Delta B$.

Let $X = Z(\Delta B)$. Then r is corrected as follows:

$$r_{corrected} = r + X + X^2 \left( \tfrac{1}{2} D_2 + \tfrac{1}{6} D_3 X \right)$$

If one iteration does not produce agreement to within $5 \times 10^{-8}$ in the
proportion, another iteration is performed. In this way, the desired
correlation is reached.

The accuracy obtained in the correlation itself varies. For certain
distributions (correlations very nearly 1.0, or very large or small h and k),
dB/dr is nearly zero, so that a small error in the proportion corresponds to a
large error in the correlation. However, the correlation given by the program
does reproduce the four-fold table with an error of not more than $5 \times 10^{-8}$,
which is a reasonable measure of the accuracy of the correlation. It should
also be noted that in the exceptional ranges mentioned, a small error in the
input causes a very large error in the correlation, making it highly
unreliable. (These are the cases where the standard error is largest, for a
given N.)
24. If there is an outside criterion, a validity coefficient is computed
which is Pearson's r.

$$r_{xy} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{(N - 1)\, \sigma_x \sigma_y}$$

where

$r_{xy}$ = correlation between X and Y.

$X_i$ = internal criterion scores.

$\bar{X}$ = mean of X values.

$Y_i$ = external criterion scores.

$\bar{Y}$ = mean of Y values.

$\sigma_x$ = standard deviation of the distribution of X scores.

$\sigma_y$ = standard deviation of the distribution of Y scores.


25. Calculation of abscissa and ordinate of standard normal distribution at
dichotomy.

The abscissa (x) and the proportion passing item (p) are related as follows:

$$p = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-t^2/2}\, dt$$

The ordinate (y) is then computed as:

$$y = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$

x is obtained in the manner described in 26 below.


26. Computation of the Inverse Normal Distribution Function.

The inverse normal (cumulative) distribution function x(q) is defined by
the equation

$$q = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-t^2/2}\, dt, \qquad 0 < q < 1$$

However, since x(1 - q) = -x(q), only the range $0 < q \le .5$ needs to
be considered.

Hastings (1964) gives a min-max rational approximation to x(q) which has a
maximum error of $4.5 \times 10^{-4}$ over the range $0 < q \le .5$. Since greater
accuracy was desired, the following approach was taken.
(a.) Obtain the derivatives of x(q) as follows:

$$\frac{dx}{dq} = -\sqrt{2\pi}\, e^{x^2/2}, \qquad \frac{d^2x}{dq^2} = x \left( \frac{dx}{dq} \right)^2, \qquad \frac{d^3x}{dq^3} = (1 + 2x^2) \left( \frac{dx}{dq} \right)^3$$

Higher order terms are generated in a similar manner.

(b.) Obtain an initial approximation $x_i$ by the Hastings formula
(26.2.23).

(c.) Compute the error in q(x), say $\Delta q = q(x_i) - q$. Most Fortran
compilers have the Error Function available from which $q(x_i)$ can be
obtained. For large $x_i$, a Gaussian continued fraction can be used to
arrive at $\Delta q$.
(d.) Write the correction terms as a Taylor series for x in terms of $\Delta q$,
expanding about $q(x_i)$ (which is the q of the initial approximation),
$10^{-38} \le q \le .5$. The lower limit ($10^{-38}$) is the smallest number on the
IBM 7040 computer on which this program was developed.

It was found that only the first two terms of the Taylor series were
required to attain desirable precision on an eight-digit machine, yielding

an  error  of  the  magnitude  of  5x10  .  Using  the  first  five  terms  of  the 

series resulted in an error of less than $10^{-15}$.

The first term in the series is equivalent to a single iteration by
Newton's method. The reason for using higher order correction terms in
lieu of further iterations is that the former can be obtained very
quickly from the first-order term, whereas additional iterations would
require evaluation of q(x) and dx/dq for each iteration.

A graphical illustration may be helpful to understand the algorithm.
Figure A-1 shows the initial approximation $x_i$, with the associated
$q(x_i)$; the derived $q(x_f)$ associated with the final approximation $x_f$;
and the error (or difference) between the initial and final approximation
($\Delta x$ and $\Delta q$).
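The final approximation, $x_f$, correct to eight digits, is obtained as
follows:

Let

$$z = \frac{\Delta q}{y(x_i)}, \qquad y(x_i) = \frac{1}{\sqrt{2\pi}}\, e^{-x_i^2/2}$$

Then

$$x_f = x_i + z \left( 1 + \frac{z\, x_i}{2} \right)$$

For greater accuracy the final approximation, $x_f$, takes the form of

$$x_f = x_i + z \left( 1 + z \left( \frac{x_i}{2} + z \left( \frac{1 + 2x_i^2}{6} + z \left( \frac{x_i (7 + 6x_i^2)}{24} + z\, \frac{7 + x_i^2 (46 + 24x_i^2)}{120} \right) \right) \right) \right)$$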
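The refinement step can be sketched in Python as follows. This is an illustration, not the original FORTRAN: the function names are hypothetical; the starting value uses the rational approximation listed as formula 26.2.23 in Abramowitz and Stegun, with its published coefficients; q is the upper-tail area as defined above; and z is taken as the error in q divided by the ordinate at the starting value, which is the definition that makes the first term a Newton step.

```python
import math


def upper_tail(x: float) -> float:
    """q(x): area of the standard normal distribution above x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def hastings_initial(q: float) -> float:
    """Rational starting approximation for 0 < q <= 0.5 (error below 4.5e-4)."""
    t = math.sqrt(-2.0 * math.log(q))
    num = 2.515517 + t * (0.802853 + t * 0.010328)
    den = 1.0 + t * (1.432788 + t * (0.189269 + t * 0.001308))
    return t - num / den


def inverse_normal_upper(q: float) -> float:
    """x such that the upper-tail area is q, refined with five Taylor terms."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if q > 0.5:
        return -inverse_normal_upper(1.0 - q)      # x(1 - q) = -x(q)
    x = hastings_initial(q)
    y = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)   # ordinate at x
    z = (upper_tail(x) - q) / y                              # first-order correction
    x2 = x * x
    return x + z * (1.0 + z * (x / 2.0 + z * ((1.0 + 2.0 * x2) / 6.0
               + z * (x * (7.0 + 6.0 * x2) / 24.0
               + z * (7.0 + x2 * (46.0 + 24.0 * x2)) / 120.0))))


print(inverse_normal_upper(0.025))
```

Because the starting error is below 4.5e-4, each extra term shrinks the remaining error by roughly that factor, which is why a single pass without further iteration is enough.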


27.  Correction  for  overlap  in  biserial  correlation  when  item  score 
contributes  to  criterion  score  (internal  criterion). 

Guilford's  Method: 


rh  •  6*  -  p  ^  /  y 

corrected  rb^g  -  -  ■  . .  —  - - -  - 

•  1S  (p^V)-  2rb>?  (P'hV) 


where $\sigma$ = standard deviation of test.

p, q are the same as in 14.

Henrysson's Method:


corrected  r.  . 

bis 


cr  -  p^v  /W 

"V^a+p<r  -lr^(Ty 


28. Correction for overlap in point biserial correlation.

$$\text{corrected } r_{pbis} = \frac{r_{pbis}\, \sigma - \sqrt{pq}}{\sqrt{\sigma^2 + pq - 2\, r_{pbis}\, \sigma \sqrt{pq}}}$$


29. More on the Item Characteristic Curve

The information given about the item variable is a dichotomy since the item
score is either pass or fail. To justify the statements about the shape of
the item characteristic curve (i.e., ogive) and the formulas used for
estimating its parameters, we have to make certain basic assumptions:

(1) The item variable is continuous even though we know only dichotomous
information about it. That is, individuals know various amounts of
information about the item, a certain amount of which is necessary to
fall into the pass side of the pass/fail dichotomy.

(2)  The  regression  of  the  item  variable  on  the  criterion  variable  is 
linear. 


(3) The conditional distribution of the item variable (i.e., the
distribution of the item variable for a given criterion score) is
normal, with a variance independent of the criterion.

The following discussion assumes that both the criterion variable and the item
variable are in standard scores (i.e., deviations from the mean in standard
deviation units). This means that the regression line passes through the
origin and has a slope of r, where r is the correlation between the two variables,
and that the variance of the conditional distribution of the item variable is
$(1 - r^2)$. Figure A-2 is a graphic representation of the situation where

p = proportion of individuals passing the item

q = (1 - p) = proportion of individuals failing the item

c = cutoff point on the item variable corresponding to the pass/fail
dichotomy

The abscissa represents the criterion variable in standard scores (x-axis).

The ordinate represents the item variable in standard scores (y-axis).

p(x) = proportion of individuals passing the item for a given criterion score x
(shaded areas in Figure A-2).


FIGURE  A -2 


The conditional distributions have means on the line y = rx since
linearity of regression was assumed.


Consider now p(x); the proportion above the cutoff point C in the
conditional distribution varies with x. Since the conditional distribution
is normal, its variance is $1 - r^2$ (see Proof 1) and furthermore this
variance is independent of x. However, as x changes, the distance between
the mean of the conditional distribution and the cutoff point C changes as
well. In fact this distance is (C - rx). Since the displacement of the mean
from the cutoff is a linear function of x, the proportion above the cutoff in
the conditional distribution produces a normal ogive (cumulative normal
distribution) with respect to x.

Proof 1. To prove that the variance of the conditional normal distribution
equals $(1 - r^2)$.

By hypothesis, the variance of the conditional distribution is constant,
say $\sigma^2$. Also by hypothesis, the mean of the conditional distribution is
(rx). Therefore

$$(1) \qquad \int_{-\infty}^{\infty} (y - rx)^2 f(y|x)\, dy = \sigma^2$$

where f(y|x) = conditional density of y.

Also by hypothesis (normal distribution), the means of y and x are zero
and their variance is 1.
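Thus:

$$(2) \qquad \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y^2 g(x) f(y|x)\, dy\, dx = 1 \qquad \text{(variance of y)}$$

where g(x) = density of x

and g(x)f(y|x) = joint density of y and x.

From (1) above,

$$\int_{-\infty}^{\infty} y^2 f(y|x)\, dy - 2rx \int_{-\infty}^{\infty} y f(y|x)\, dy + r^2 x^2 \int_{-\infty}^{\infty} f(y|x)\, dy = \sigma^2$$

Collecting terms and realizing that

$$\int_{-\infty}^{\infty} y f(y|x)\, dy = rx \quad \text{and} \quad \int_{-\infty}^{\infty} f(y|x)\, dy = 1,$$

$$\int_{-\infty}^{\infty} y^2 f(y|x)\, dy = \sigma^2 + 2r^2 x^2 - r^2 x^2 = \sigma^2 + r^2 x^2$$

Substituting into (2) results in:

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y^2 f(y|x) g(x)\, dy\, dx = \int_{-\infty}^{\infty} (\sigma^2 + r^2 x^2) g(x)\, dx = \sigma^2 \int_{-\infty}^{\infty} g(x)\, dx + r^2 \int_{-\infty}^{\infty} x^2 g(x)\, dx = 1$$

Since

$$\int_{-\infty}^{\infty} g(x)\, dx = 1 \qquad \text{(density of x)}$$

and

$$\int_{-\infty}^{\infty} x^2 g(x)\, dx = 1 \qquad \text{(variance of x by hypothesis)}$$

we have

$$\sigma^2 (1) + r^2 (1) = 1$$

$$\sigma^2 = 1 - r^2$$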

Returning to Figure A-2, at the point where the cutoff line y = C
intersects the regression line y = rx, the value of x is called X50. At this
point, as can be seen from Figure A-2, 50% of the conditional distribution
falls above and 50% falls below the cutoff, since the mean of the conditional
distribution coincides with the cutoff at this point. Since C = r(X50) at
this point, X50 = C/r.

X50 is also the inflection point for the item characteristic curve (the
curve of p(x) plotted against the criterion score). This curve is a normal
ogive and has a standard deviation of

$$\frac{\sqrt{1 - r^2}}{r}$$


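Proof 2. The conditional density of the item variable y is

$$f(y|x) = \frac{1}{\sqrt{2\pi (1 - r^2)}} \exp\left( -\frac{(y - rx)^2}{2(1 - r^2)} \right)$$

The proportion above the cutoff is

$$p(x) = \frac{1}{\sqrt{2\pi (1 - r^2)}} \int_{c}^{\infty} \exp\left( -\frac{(y - rx)^2}{2(1 - r^2)} \right) dy$$

Let t = y/r - x, so that dy = r dt; then

$$p(x) = \int_{c/r - x}^{\infty} \frac{r}{\sqrt{2\pi (1 - r^2)}} \exp\left( -\frac{t^2}{2(1 - r^2)/r^2} \right) dt$$

which is a normal ogive with standard deviation of

$$\frac{\sqrt{1 - r^2}}{r}$$

and inflection point of c/r.

The reciprocal of the standard deviation of the normal ogive is called
$\beta$.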
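The relation between the cutoff c, the correlation r, and the two curve parameters can be checked numerically. The Python sketch below is illustrative only; the function names `icc_parameters` and `proportion_passing` are assumptions. It evaluates p(x) directly from the conditional normal distribution and compares it with the ogive written in terms of X50 and beta.

```python
import math
from statistics import NormalDist


def icc_parameters(c: float, r: float):
    """Return (X50, beta) for cutoff c and item-criterion correlation r."""
    x50 = c / r
    beta = r / math.sqrt(1.0 - r * r)    # reciprocal of the ogive's standard deviation
    return x50, beta


def proportion_passing(x: float, c: float, r: float) -> float:
    """p(x): area above c in the conditional distribution N(rx, 1 - r^2)."""
    conditional = NormalDist(mu=r * x, sigma=math.sqrt(1.0 - r * r))
    return 1.0 - conditional.cdf(c)


x50, beta = icc_parameters(0.3, 0.6)
for x in (-1.0, x50, 1.2):
    direct = proportion_passing(x, 0.3, 0.6)
    ogive = NormalDist().cdf(beta * (x - x50))
    print(x, round(direct, 6), round(ogive, 6))
```

The two columns agree at every x, and the value at X50 is 0.5, as the derivation requires.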
There are numerous methods of estimating the parameters c and r from a given
sample. Probably the best is the maximum likelihood method by which $\hat{c}$ and $\hat{r}$,
the estimates of c and r, are chosen in such a way as to maximize the
probability of occurrence of the sample data at hand, with the hypothesized
probability distribution depending only on these two parameters. However,
this method leads to non-linear simultaneous equations which must be solved
iteratively with considerable labor at each step. A far simpler method for
estimating r is by use of the biserial correlation. This method, however,
requires two additional assumptions, namely that the regression of the
criterion variable on the item variable is linear and that the item variable
is normally distributed. The formula can be arrived at as follows:

Let f(x|y) be the conditional density of x. The marginal distribution of y is
hypothesized to be standard normal, so the density of y is

$$g(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}$$

The assumption of linearity of regression of x on y means that

$$\int_{-\infty}^{\infty} x f(x|y)\, dx = ry$$
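that is, the mean of the conditional distribution of x falls on the line
x = ry.

Now consider $\bar{X}_p$ = mean criterion value for cases above the cutoff line y = c:

$$\bar{X}_p = \frac{\int_{c}^{\infty} \int_{-\infty}^{\infty} x f(x|y) g(y)\, dx\, dy}{\int_{c}^{\infty} \int_{-\infty}^{\infty} f(x|y) g(y)\, dx\, dy}$$

where f(x|y)g(y) is the joint density of x and y.

$$\bar{X}_p = \frac{\int_{c}^{\infty} g(y) \int_{-\infty}^{\infty} x f(x|y)\, dx\, dy}{\int_{c}^{\infty} g(y) \int_{-\infty}^{\infty} f(x|y)\, dx\, dy}$$

Now

$$\int_{-\infty}^{\infty} x f(x|y)\, dx = ry$$

by hypothesis and

$$\int_{-\infty}^{\infty} f(x|y)\, dx = 1$$

therefore

$$\bar{X}_p = \frac{r \int_{c}^{\infty} y\, g(y)\, dy}{\int_{c}^{\infty} g(y)\, dy}$$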
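but

$$\int_{c}^{\infty} g(y)\, dy = p \quad \text{and} \quad \int_{c}^{\infty} y\, g(y)\, dy = \frac{1}{\sqrt{2\pi}}\, e^{-c^2/2}$$

thus

$$\bar{X}_p = \frac{r}{p} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-c^2/2}$$

hence

$$r = \frac{\bar{X}_p\, p}{\frac{1}{\sqrt{2\pi}}\, e^{-c^2/2}}$$

Let the ordinate at the cutoff be written (as in 16)

$$y = \frac{1}{\sqrt{2\pi}}\, e^{-c^2/2}$$

where c is simply the abscissa corresponding to a proportion p in a
standard normal distribution, since the item variable was assumed to
be normally distributed.

Rewriting $\bar{X}_p$ in terms of raw scores,

$$\bar{X}_p = \frac{M_p - M_T}{\sigma_T}, \qquad r = \frac{M_p - M_T}{\sigma_T} \cdot \frac{p}{y}$$

which is the formula for the biserial correlation.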


This relationship is exact for the population but it is only an estimate
when written in terms of sample values. It is not as efficient as the maximum
likelihood method, particularly for a large r; in fact for a large r, it is
a rather poor estimator.

It is possible to incorporate the maximum likelihood method in this item
analysis package; however, core limitations of the computer (IBM 7040) for
which this program was originally developed would have made the attempt
impractical with an added costly time factor. See also Ree (1979).
30. Average Item Difficulty

The percent of individuals passing a certain item is converted to a
standard difficulty (D):

D = 0.5 - 0.16147653x

where x is the abscissa corresponding to the proportion of individuals
passing the item (upper portion of the normal distribution).

To convert back from standard difficulty to the corresponding proportion:

$$\text{Proportion} = \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-t^2/2}\, dt$$

where x = (.5 - D)/.16147653

Averaging is done with standard difficulties and then converted back to
proportion passing.


A proportion of .999 converts to a D = .999
A proportion of .001 converts to a D = .001
A proportion of .500 converts to a D = .500
A proportion of <.001 converts to a D = 0
A proportion of >.999 converts to a D = 1
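The conversion to and from standard difficulty, and the averaging on the D scale, can be sketched in Python. The function names are hypothetical and the standard library's `NormalDist` stands in for the program's own inverse normal routine. The scale constant is chosen so that a proportion of .999 maps to D = .999, which the sketch can be used to confirm.

```python
from statistics import NormalDist
from typing import Sequence

SCALE = 0.16147653
_N = NormalDist()


def standard_difficulty(p: float) -> float:
    """D = 0.5 - 0.16147653 x, with x cutting off proportion p in the upper tail."""
    if p < 0.001:
        return 0.0
    if p > 0.999:
        return 1.0
    x = _N.inv_cdf(1.0 - p)
    return 0.5 - SCALE * x


def proportion_from_difficulty(d: float) -> float:
    """Upper-tail area above x = (0.5 - D) / 0.16147653."""
    x = (0.5 - d) / SCALE
    return 1.0 - _N.cdf(x)


def average_difficulty(proportions: Sequence[float]) -> float:
    """Average on the D scale, then convert back to a proportion passing."""
    ds = [standard_difficulty(p) for p in proportions]
    return proportion_from_difficulty(sum(ds) / len(ds))


print(standard_difficulty(0.999), average_difficulty([0.2, 0.8]))
```

Averaging on the D scale treats difficulty as linear in the normal deviate, so items passed by 20% and 80% of examinees average to 50%, while more extreme pairs are weighted by their distance in standard units rather than in raw proportions.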

31. Mean of the jth alternative of item i

$$\bar{X}_{ij} = \frac{n_{ij}}{IF}$$

where $n_{ij}$ = the number of people choosing alternative j of item i,

IF = N if the test was a power test and

IF = the number of people reaching item i if the test was a
speed test.

32. Standard Deviation of the jth alternative of item i

$$SD_{ij} = \sqrt{\bar{X}_{ij} (1 - \bar{X}_{ij})}$$

33. Correlation between alternatives a and b of item i

$$r_{ab} = -\frac{\bar{X}_{ia}\, \bar{X}_{ib}}{SD_{ia}\, SD_{ib}}$$

(Formulas 31 and 32 are derived from the regular formulas by noting that
the individual values of the alternatives are 1's and 0's only and that
the alternatives are mutually exclusive.)
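The shortcut for mutually exclusive 0/1 alternatives can be verified against an ordinary product-moment correlation. The Python sketch below is an illustration with hypothetical function names; the list `choices` is assumed to hold the alternative chosen by each person who reached the item, which covers both the power-test and the speed-test definition of IF.

```python
import math
from typing import List, Sequence, Tuple


def alternative_stats(choices: Sequence[int], n_alternatives: int) -> Tuple[List[float], List[float]]:
    """Mean and standard deviation of each alternative's 0/1 indicator."""
    n = len(choices)                 # IF: the people who reached the item
    means = [sum(1 for c in choices if c == j) / n for j in range(1, n_alternatives + 1)]
    sds = [math.sqrt(m * (1.0 - m)) for m in means]
    return means, sds


def alternative_correlation(mean_a: float, mean_b: float) -> float:
    """Correlation of two mutually exclusive indicators, from their means alone."""
    return -math.sqrt(mean_a * mean_b / ((1.0 - mean_a) * (1.0 - mean_b)))


def pearson(u: Sequence[float], v: Sequence[float]) -> float:
    """Ordinary product-moment correlation, used here only as a cross-check."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))


choices = [1, 1, 2, 3, 3, 3, 2, 1, 3, 2]
means, sds = alternative_stats(choices, 3)
ind1 = [1.0 if c == 1 else 0.0 for c in choices]
ind3 = [1.0 if c == 3 else 0.0 for c in choices]
print(alternative_correlation(means[0], means[2]), pearson(ind1, ind3))
```

Both values agree, which shows why no cross-products need to be accumulated for the alternatives: the product of two mutually exclusive indicators is always zero.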




34. Point-Biserial correlation between the criterion Y and the jth
alternative of item i

$$r_{pb,ij} = \frac{\sum_{k=1}^{N} X_{ijk} Y_k / n_{ij} - \bar{Y}_i}{SD_{Y_i}} \sqrt{\frac{\bar{X}_{ij}}{1 - \bar{X}_{ij}}}$$

(Note that in a speed test $\bar{Y}_i$ and $SD_{Y_i}$ are based on the IF reaching
item i; in a power test, they are computed on the full sample).

35. Biserial correlation between the criterion Y and the jth alternative
of item i

$$r_{bis,ij} = \frac{\sum_{k=1}^{N} X_{ijk} Y_k / n_{ij} - \bar{Y}_i}{SD_{Y_i}} \cdot \frac{\bar{X}_{ij}}{Z}$$

where Z = the ordinate of the unit normal distribution curve with area
equal to 1.0, at the point of division between segments containing p and q
($\bar{X}_{ij}$ and $1 - \bar{X}_{ij}$) proportions of the cases.
RTEST: Biserial validity significance test value


r  S 

the  significance  test  is  bis  jy 


then:

P ≤ .05 if RTEST ≥ 1.96
P ≤ .01 if RTEST ≥ 2.576
APPENDIX  B

IAP  Sample  Run

ITEM ALTERNATIVE INFORMATION ROSTER (BASED ON 'CORRECTED TEST SCORE' CRITERION)